Cross-lingual zero-resource named entity recognition model based on sentence-level generative adversarial network

Xiaoyan ZHANG, Zhengyu DUAN

Journal of Computer Applications 2023, 43 (8): 2406-2411. DOI: 10.11772/j.issn.1001-9081.2022071124

To address the lack of labeled data in low-resource languages, which prevents the use of existing mature deep learning methods for Named Entity Recognition (NER), a cross-lingual NER model based on a sentence-level Generative Adversarial Network (GAN), namely SLGAN-XLM-R (Sentence Level GAN based on XLM-R), was proposed. Firstly, the labeled data of the source language was used to train the NER model on top of the pre-trained model XLM-R (XLM-Robustly optimized BERT pretraining approach). At the same time, language-adversarial training was performed on the embedding layer of the XLM-R model using the unlabeled data of the target language. Then, soft labels for the unlabeled target-language data were predicted with the NER model. Finally, the labeled source-language data and the soft-labeled target-language data were mixed to fine-tune the model again and obtain the final NER model. Experiments were conducted on four languages, English, German, Spanish, and Dutch, from two datasets, CoNLL2002 and CoNLL2003. The results show that, with English as the source language, the F1 scores of the SLGAN-XLM-R model on the German, Spanish, and Dutch test sets are 72.70%, 79.42%, and 80.03%, respectively, which are 5.38, 5.38, and 3.05 percentage points higher than those obtained by directly fine-tuning the XLM-R model.
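
The soft-label and data-mixing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag set, the example logits, and the helper names (`softmax`, `one_hot`, `soft_cross_entropy`) are all assumptions chosen for illustration, and the XLM-R encoder and the adversarial discriminator are omitted entirely. The sketch only shows how gold source-language labels (as one-hot vectors) and predicted target-language distributions (as soft labels) could share one loss during the second fine-tuning pass.

```python
import math

# Hypothetical tag set, for illustration only.
TAGS = ["O", "PER", "LOC"]


def softmax(logits):
    """Turn raw scores into a probability distribution (a soft label)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(index, num_classes):
    """Gold source-language label expressed as a distribution."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def soft_cross_entropy(pred_probs, target_probs, eps=1e-12):
    """Cross-entropy against a target distribution (one-hot or soft)."""
    return -sum(t * math.log(p + eps) for p, t in zip(pred_probs, target_probs))


# Source language: gold tags become one-hot targets.
source_targets = [one_hot(TAGS.index(t), len(TAGS)) for t in ["PER", "O"]]

# Target language: made-up logits standing in for the first-stage
# NER model's predictions on unlabeled tokens.
target_logits = [[0.2, 2.0, 0.1], [1.5, 0.3, 0.2]]
target_targets = [softmax(row) for row in target_logits]

# Mixed training targets for the second fine-tuning pass.
mixed_targets = source_targets + target_targets

# Example: loss of one (made-up) prediction against each mixed target.
prediction = softmax([0.1, 1.0, 0.1])
losses = [soft_cross_entropy(prediction, t) for t in mixed_targets]
```

Expressing both kinds of supervision as distributions keeps a single loss function for the mixed data; whether the paper weights the two sources differently is not stated in the abstract.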
